Patent Mining: A Baseline Approach

Authors

  • Fredric C. Gey
  • Ray R. Larson
Abstract

For NTCIR Workshop 7, UC Berkeley participated in both the IR4QA and the Patent Mining tasks. This paper summarizes our approach to Patent Mining. We focused on the US Patent collection, and our methodology was to treat patent mining as an information retrieval task and to aggregate multiple patent classifications from the retrieved patent documents. Performance was relatively poor, possibly because we retrieved too many documents or because we did not use blind feedback techniques.
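
As a rough illustration of the aggregation step described in the abstract, the following Python sketch sums retrieval scores per classification code across the retrieved patents and returns the highest-scoring codes. The abstract does not say how the classifications were combined, so the score-sum voting, the function name `aggregate_classifications`, the `top_k` parameter, and the sample codes and scores are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def aggregate_classifications(ranked_hits, top_k=3):
    """Combine classification codes from retrieved patents (hypothetical helper).

    ranked_hits: list of (retrieval_score, [classification codes]) pairs,
    one pair per retrieved patent document.
    Returns up to top_k codes ordered by summed retrieval score.
    """
    votes = defaultdict(float)
    for score, codes in ranked_hits:
        for code in codes:
            # Assumed weighting: each patent votes with its retrieval score.
            votes[code] += score
    # Sort by descending total score, breaking ties alphabetically.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [code for code, _ in ranked[:top_k]]


# Made-up retrieval results for a single query document.
hits = [
    (0.9, ["G06F", "H04L"]),
    (0.7, ["G06F"]),
    (0.4, ["A61K", "H04L"]),
]
result = aggregate_classifications(hits, top_k=2)
print(result)
```

In a sketch like this, limiting how many retrieved patents are passed in is the knob that corresponds to the abstract's concern about retrieving too many documents, since low-ranked patents would otherwise add noisy votes.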

Similar resources

Development of a Patent Matching System Using a Hybrid Approach

Much research has applied data mining and text mining tools to patent analysis, and many scholars and experts have verified the accuracy and feasibility of those tools. However, because mining tools analyze content using mathematical methods, such as linguistic algorithms, they neglect the fact that patent records are combinations o...

An Automated Research Paper Classification Method for the IPC system with the Concept Base

In the present paper, a classification method using the Concept Base is proposed and evaluated in the Patent Mining Task of the NTCIR-7 workshop. In this task, research papers are classified into the International Patent Classification (IPC) system. The classification enables research papers to be located on a patent map. In order to classify a paper, patent documents that are similar to the pa...

Corporate Decision Making with Self-Organizing Patent Maps Labeled by Technical Terms and AHP

In this paper, we propose an approach for corporate decision making with self-organizing patent maps labeled by technical terms and AHP. First, we select the patent area of interest and collect pertinent patent documents in text format. Second, we extract keywords by text mining to transform patent documents into feature vectors of the companies. Third, we input the feature matrix of technical ...

Multi-label Classification using Logistic Regression Models for NTCIR-7 Patent Mining Task

We design a multi-label classification system based on a machine learning approach for the NTCIR-7 Patent Mining Task. In our system, we employ a logistic regression model for each International Patent Classification (IPC) code that determines the IPC code assignment of research papers. The logistic regression models are trained by using patent documents provided by task organizers. To mitigate...

Constructing a Broad-coverage Lexicon for Text Mining in the Patent Domain

For mining intellectual property texts (patents), a broad-coverage lexicon that covers general English words together with terminology from the patent domain is indispensable. The patent domain is very diffuse as it comprises a variety of technical domains (e.g. Human Necessities, Chemistry & Metallurgy and Physics in the International Patent Classification). As a result, collecting a lexicon t...

Publication date: 2008